feat: Support QLoRA fine-tuning in post-training#3702

Draft
RexBearIU wants to merge 9 commits into main from jackyf/qlora

Conversation


@RexBearIU RexBearIU commented Apr 20, 2026

Description

This PR extends our existing LoRA integration to support Quantized LoRA (QLoRA) via the Qwix library for our NNX-based models. It adds configuration options for the quantization type and tile size, along with the logic fixes needed to correctly handle quantized node metadata and parameter traversal within the NNX framework.
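For illustration, the new options might appear in sft.yml roughly as follows. Only lora_weight_qtype and lora_tile_size are introduced by this PR; the surrounding keys and the concrete values shown here are hypothetical:

```yaml
# Hypothetical sft.yml fragment: enabling QLoRA on top of an existing LoRA setup.
# lora_weight_qtype and lora_tile_size are the new fields; everything else
# is illustrative and may not match the real config schema.
use_lora: true
lora_rank: 16
lora_weight_qtype: nf4   # quantization type for the frozen base weights
lora_tile_size: 256      # tile size used during quantization
```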

Key Changes

  • Config Additions: Added lora_weight_qtype (e.g., nf4) and lora_tile_size to sft.yml and the LoRA type class to enable QLoRA configurations.
  • NNX Decoder Enhancements:
    • Added metadata preservation (stash_origin_metadata and restore_origin_metadata) to correctly handle partitioning specs across scan boundaries in NNXDecoder.
    • Introduced fix_node_rank logic to dynamically adjust PartitionSpec rank constraints to match parameter shapes during scanning.
  • Qwix Provider Patching (lora_utils.py):
    • Automatically switches between LoRA and QLoRA providers based on the presence of lora_weight_qtype.
    • Implemented _patch_qwix_for_maxtext to resolve integration issues:
      • PTQ Patch: Intercepts jax.numpy.asarray to correctly handle nnx.State arrays wrapped as ptq.QArray objects.
      • Parameter Traversal Patch (find_param): Replaces flax_util.find_param with a custom implementation that accurately searches nnx.Module trees and jax.core.Tracer graphs to find the correct node references for
        quantization.
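The provider-switching behavior described above can be sketched as follows. The field names come from this PR; LoRAConfig and select_provider are hypothetical stand-ins for the real config class and the selection logic in lora_utils.py, and the actual Qwix provider classes are not shown:

```python
# Sketch of the LoRA/QLoRA provider selection described above.
# LoRAConfig and select_provider are illustrative names, not the real API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoRAConfig:
  lora_rank: int = 8
  lora_weight_qtype: Optional[str] = None  # e.g. "nf4"; presence enables QLoRA
  lora_tile_size: Optional[int] = None


def select_provider(cfg: LoRAConfig) -> str:
  # QLoRA is chosen whenever a weight quantization type is configured;
  # otherwise the plain LoRA provider is used.
  if cfg.lora_weight_qtype is not None:
    return "qlora"
  return "lora"


print(select_provider(LoRAConfig()))                         # lora
print(select_provider(LoRAConfig(lora_weight_qtype="nf4")))  # qlora
```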

Tests

Updated lora_utils_test.py to cover the new optional QLoRA config fields (lora_weight_qtype and lora_tile_size).
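A minimal sketch of what such a test might look like, assuming the config fields behave as optional dataclass fields; the actual test names and config class in lora_utils_test.py may differ:

```python
# Hypothetical unit-test sketch for the optional QLoRA config fields.
import unittest
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoRAConfig:
  # Stand-in for the real LoRA config class; both QLoRA fields default to None.
  lora_weight_qtype: Optional[str] = None
  lora_tile_size: Optional[int] = None


class QLoRAConfigTest(unittest.TestCase):

  def test_qlora_fields_default_to_none(self):
    cfg = LoRAConfig()
    self.assertIsNone(cfg.lora_weight_qtype)
    self.assertIsNone(cfg.lora_tile_size)

  def test_qlora_fields_round_trip(self):
    cfg = LoRAConfig(lora_weight_qtype="nf4", lora_tile_size=256)
    self.assertEqual(cfg.lora_weight_qtype, "nf4")
    self.assertEqual(cfg.lora_tile_size, 256)


if __name__ == "__main__":
  unittest.main()
```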

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@RexBearIU RexBearIU closed this Apr 20, 2026
@RexBearIU RexBearIU reopened this Apr 21, 2026
@RexBearIU RexBearIU force-pushed the jackyf/feat/lora-nnx branch 3 times, most recently from 669b501 to 9634d50 on April 21, 2026 08:48
@RexBearIU RexBearIU force-pushed the jackyf/qlora branch 2 times, most recently from a2739ab to fc27813 on April 22, 2026 08:01
@RexBearIU RexBearIU force-pushed the jackyf/feat/lora-nnx branch 22 times, most recently from 8531d67 to de940c3 on April 29, 2026 12:28
@RexBearIU RexBearIU force-pushed the jackyf/qlora branch 8 times, most recently from 2d237f8 to f8a7f1b on May 4, 2026 11:06
@RexBearIU RexBearIU force-pushed the jackyf/feat/lora-nnx branch from 5a7bedb to 397f319 on May 5, 2026 08:03
@codecov

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 69.69697% with 20 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
src/maxtext/utils/maxtext_utils_nnx.py | 66.07% | 8 missing and 11 partials ⚠️
src/maxtext/layers/nnx_decoders.py | 87.50% | 0 missing and 1 partial ⚠️


@RexBearIU RexBearIU changed the title Jackyf/qlora feat: Support QLoRA fine-tuning in post-training May 5, 2026
@RexBearIU RexBearIU force-pushed the jackyf/feat/lora-nnx branch 5 times, most recently from 391664f to 1eb5953 on May 8, 2026 11:11
@shralex shralex force-pushed the jackyf/feat/lora-nnx branch 2 times, most recently from 3954ce8 to 6face5b on May 8, 2026 15:01
@RexBearIU RexBearIU force-pushed the jackyf/feat/lora-nnx branch from 6face5b to eced9d7 on May 11, 2026 10:20
Labels: none yet

1 participant